On Time Optimal Supernode Shape
Authors
Abstract
With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses the selection of an optimal supernode shape for a supernode transformation (also known as tiling). We assume that the communication cost is dominated by the startup penalty and can therefore be approximated by a constant. We identify three parameters of a supernode transformation: supernode size, relative side lengths, and cutting hyperplane directions. For algorithms with perfectly nested loops and uniform dependencies, we give a closed-form expression for an optimal linear schedule vector and a necessary and sufficient condition for optimal relative side lengths. We prove that the total running time is minimized by a cutting hyperplane direction matrix whose rows are taken from the surface of the polar cone of the cone spanned by the dependence vectors, also known as the tiling cone. The results are derived in continuous space and should therefore be considered approximate.
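To illustrate why relative side lengths affect total running time under a constant communication cost, here is a minimal sketch. It is not the paper's closed-form result. It assumes a 2-D rectangular iteration space, dependence vectors (1, 0) and (0, 1), rectangular supernodes, a wavefront schedule (1, 1) over supernodes, and enough processors to run each wavefront in parallel. The names `total_time`, `best_shape`, `t_comp`, and `t_start` are hypothetical and introduced only for this example.

```python
import math


def total_time(n1, n2, s1, s2, t_comp, t_start):
    """Estimated running time for s1 x s2 rectangular supernodes.

    Assumed model (not taken from the paper): the supernodes are executed
    along the wavefront schedule (1, 1); each step costs the computation of
    one supernode plus one constant communication startup.
    """
    tiles1 = math.ceil(n1 / s1)      # supernodes along the first axis
    tiles2 = math.ceil(n2 / s2)      # supernodes along the second axis
    steps = tiles1 + tiles2 - 1      # length of the (1, 1) wavefront schedule
    return steps * (s1 * s2 * t_comp + t_start)


def best_shape(n1, n2, size, t_comp, t_start):
    """Search the integer side lengths with s1 * s2 == size for the fastest shape."""
    shapes = [(s1, size // s1) for s1 in range(1, size + 1) if size % s1 == 0]
    return min(shapes, key=lambda s: total_time(n1, n2, s[0], s[1], t_comp, t_start))


if __name__ == "__main__":
    # Fixed supernode size 100 on a 120 x 120 iteration space.
    for shape in [(10, 10), (20, 5), (4, 25)]:
        print(shape, total_time(120, 120, shape[0], shape[1], 1, 50))
    print("best:", best_shape(120, 120, 100, 1, 50))
```

With the supernode size held fixed, the per-step cost is the same for every shape, so only the number of wavefront steps changes. In this toy model that is the only way the shape influences the total time.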
Similar resources
On Optimal Size and Shape of Supernode Transformations
Supernode transformation has been proposed to reduce the communication startup cost by grouping a number of iterations in a perfectly nested loop with uniform dependencies as a supernode which is assigned to a processor as a single unit. A supernode transformation is specified by n families of hyperplanes which slice the iteration space into parallelepiped supernodes, the grain size of a supe...
On Supernode Transformation with Minimized Total Running Time
With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses how to find an optimal supernode size and optimal supernode relative side lengths of a supernode transformation (also known as tiling). We identify three parameters of supernode transformation: supernode size, relative side lengths, and cutting hyperplane...
Expediating IP lookups with reduced power via TBM and SST supernode caching
In this paper, we propose a novel supernode caching scheme to reduce IP lookup latencies and energy consumption in network processors. Instead of using an expensive TCAM based scheme, we imp...
Matching-based preprocessing algorithms to the solution of saddle-point problems in large-scale nonconvex interior-point optimization
Interior-point methods are among the most efficient approaches for solving large-scale nonlinear programming problems. At the core of these methods, highly ill-conditioned symmetric saddle-point problems have to be solved. We present combinatorial methods to preprocess these matrices in order to establish more favorable numerical properties for the subsequent factorization. Our approach is base...
Data Parallel Code Generation for Arbitrarily Tiled Loop Nests
Tiling or supernode transformation is extensively discussed as a loop transformation to efficiently execute nested loops onto distributed memory machines. In addition, a lot of work has been done concerning the selection of a communication-minimal and a scheduling-optimal tiling transformation. However, no complete approach has been presented in terms of implementation for non-rectangularly til...
Journal title:
- IEEE Trans. Parallel Distrib. Syst.
Volume 13, issue
Pages -
Publication date 1999